14 research outputs found

    Deploying AI Frameworks on Secure HPC Systems with Containers

    The growing interest of the research community and industry in using Artificial Intelligence (AI) techniques to tackle "real world" problems requires High Performance Computing (HPC) resources to efficiently compute and scale complex algorithms across thousands of nodes. Unfortunately, typical data scientists are not familiar with the unique requirements and characteristics of HPC environments. They usually develop their applications with high-level scripting languages or frameworks such as TensorFlow, and the installation process often requires a connection to external systems to download open source software during the build. HPC environments, on the other hand, are often based on closed source applications that incorporate parallel and distributed computing APIs such as MPI and OpenMP, while users have restricted administrator privileges and face security restrictions such as not being allowed access to external systems. In this paper we discuss the issues associated with the deployment of AI frameworks in a secure HPC environment and how we successfully deployed AI frameworks on SuperMUC-NG with Charliecloud.
    Comment: 6 pages, 2 figures, 2019 IEEE High Performance Extreme Computing Conference

    Model-Agnostic Federated Learning

    Since its debut in 2016, Federated Learning (FL) has been tied to the inner workings of Deep Neural Networks (DNNs). On the one hand, this allowed its development and widespread use as DNNs proliferated. On the other hand, it neglected all those scenarios in which using DNNs is not possible or advantageous. The fact that most current FL frameworks only allow training DNNs reinforces this problem. To address the lack of FL solutions for non-DNN-based use cases, we propose MAFL (Model-Agnostic Federated Learning). MAFL marries a model-agnostic FL algorithm, AdaBoost.F, with an open industry-grade FL framework: Intel OpenFL. MAFL is the first FL system not tied to any specific type of machine learning model, allowing exploration of FL scenarios beyond DNNs and trees. We test MAFL from multiple points of view, assessing its correctness, flexibility and scaling properties up to 64 nodes. We optimised the base software achieving a 5.5x speedup on a standard FL scenario. MAFL is compatible with x86-64, ARM-v8, Power and RISC-V.
    Comment: Published at the EuroPar'23 conference, Limassol, Cyprus

    Qualitative assessment of oesophageal cancer metabolic tumour volumes delineated by an artificial intelligence algorithm

    Year: 2020. Session type: E-poster/poster. Theme: Big data and AI.
    Authors: Craig Parkinson, Kieran Foley, Shailen Sobhee, Walter Riviera, Salvatore Berenato, Costas Stylianou, Tom Crosby, Emiliano Spezi
    Abstract
    Background: Incidence of oesophageal cancer is rising. Radiotherapy is increasingly used to treat this poor prognosis disease but requires significant resources to plan treatment. Therefore, automated methods would be preferred. Quantitative analysis of artificial intelligence (AI) algorithms is often reported, but qualitative evaluation is lacking. We investigated observers' ability to differentiate manual versus fully automated AI outlining of metabolic tumour volume (MTV) using a Turing test, including inter- and intra-observer variability.
    Method: Five radiologists (Ob1 to Ob5) independently observed 580 contours. 256 contours were delineated using a U-Net deep learning (DL) model and 324 were delineated manually. Observers decided whether the contour had been created with a DL method, manually, or if they were unable to tell. Of the 580 contours, 37 contours were repeated twice. Observers were blinded to the method and presented with a co-registered PET/CT with a contour overlay. CT imaging was windowed to a window width of 330 Hounsfield units (HU) and a window centre of -10 HU.
    Results: Overall, Ob1 to Ob5 correctly identified 165 (28.4%), 199 (51.6%), 190 (32.8%), 181 (31.2%) and 193 (33.3%) out of 580 cases, respectively. Ob1 to Ob5 identified 202 (78.9%), 199 (77.7%), 159 (62.1%), 189 (73.8%) and 143 (55.9%) of 256 DL contours as being manually delineated. In repeat imaging, Ob1 changed opinion in 9 cases, Ob2 in 10 cases, Ob3 in 10 cases, Ob4 in 7 cases and Ob5 in 8 cases. On average observers changed opinion in 9 cases (21.6%), with a minimum of 7 cases (18.9%) and a maximum of 10 cases (27.0%). Observers on average identified 178.4 (69.6%) of the DL contours as being delineated manually (range: minimum 143 cases (55.8%), maximum 202 cases (78.9%)).
    Conclusion: We have shown that Turing tests provide an additional method for qualitative evaluation that complements quantitative metrics when assessing AI algorithm performance in outlining metabolic tumour volumes. In our study, observers were unable to confidently determine the delineation method, suggesting a strong performance of the AI algorithm. However, observer selection is subject to inter- and intra-observer variability and is potentially affected by clinical experience.

    TRY plant trait database – enhanced coverage and open access

    Plant traits - the morphological, anatomical, physiological, biochemical and phenological characteristics of plants - determine how plants respond to environmental factors, affect other trophic levels, and influence ecosystem properties and their benefits and detriments to people. Plant trait data thus represent the basis for a vast area of research spanning from evolutionary biology, community and functional ecology, to biodiversity conservation, ecosystem and landscape management, restoration, biogeography and earth system modelling. Since its foundation in 2007, the TRY database of plant traits has grown continuously. It now provides unprecedented data coverage under an open access data policy and is the main plant trait database used by the research community worldwide. Increasingly, the TRY database also supports new frontiers of trait‐based plant research, including the identification of data gaps and the subsequent mobilization or measurement of new data. To support this development, in this article we evaluate the extent of the trait data compiled in TRY and analyse emerging patterns of data coverage and representativeness. Best species coverage is achieved for categorical traits - almost complete coverage for ‘plant growth form’. However, most traits relevant for ecology and vegetation modelling are characterized by continuous intraspecific variation and trait–environmental relationships. These traits have to be measured on individual plants in their respective environment. Despite unprecedented data coverage, we observe a humbling lack of completeness and representativeness of these continuous traits in many aspects. We, therefore, conclude that reducing data gaps and biases in the TRY database remains a key challenge and requires a coordinated approach to data mobilization and trait measurements. This can only be achieved in collaboration with other initiatives

    FeLebrities: A User-Centric Assessment of Federated Learning Frameworks

    Federated Learning (FL) is a new paradigm aimed at solving data access problems. It provides a solution by moving the focus from sharing data to sharing models. The FL paradigm involves different entities (institutions) holding proprietary datasets that collaborate to train a global Artificial Intelligence (AI) model using their own locally available data. Although several studies have proposed methods to distribute the computation or aggregate results, few efforts have addressed how to implement FL pipelines. With the aim of accelerating the exploitation of FL frameworks, this paper proposes a survey of public tools that are currently available for building FL pipelines, an objective ranking based on the current state of user preferences, and an assessment of the growth trend of the tools' popularity over a one-year time window, with measurements performed every six months. These measurements include objective metrics, like the number of “Watch,” “Star” and “Follow” available from software repositories, as well as thirteen custom metrics grouped into three main categories: Usability, Portability, and Flexibility. Finally, a ranking of the maturity of the tools is derived based on the key aspects to consider when building an FL pipeline.

    Crowdsearching Training Sets for Image Classification

    The success of an object classifier depends strongly on its training set, but this fact seems to be generally neglected in the computer vision community, which focuses primarily on the construction of descriptive features and the design of fast and effective learning mechanisms. Furthermore, collecting training sets is a very expensive step, which needs a considerable amount of manpower for selecting the most representative samples for an object class. In this paper, we address this problem, following the very recent trend of automating the collection of training images for image classification: in particular, we exploit a source of information never considered so far for this purpose, namely textual tags. Textual tags are usually attached by the crowd to the images of social platforms like Flickr, associating the visual content with explicit semantics, which unfortunately is noisy in many cases. Our approach leverages this shared knowledge and collects images spanning the visual variance of an object class, while removing the noise by different filtering query expansion techniques. Comparative results favor our method, which can automatically generate, in a few minutes, a training dataset that leads to an average precision of 81.41% on the PASCAL VOC 2012 dataset.

    MiFL: Multi-Input Neural Networks in Federated Learning


    Statistical Analysis of Personality and Identity in Chats Using a Keylogging Platform

    Interacting via text chats can be considered a hybrid type of communication, in which textual information delivery follows turn-taking dynamics, resembling spoken interactions. An interesting research question is whether personality can be observed in chats, as it can in face-to-face exchanges. After an encouraging preliminary work on Skype, in this study we set up our own chat service with key-logging functionalities activated, so that the timing of each key press can be measured. Using this framework, we organized semi-structured chats between 50 subjects, whose personality traits were analyzed through psychometric tests, and a single operator, for a total of 16 hours of conversation. On these data, we observed that some personality traits are linked with the way we chat (measured by stylometric cues), by means of statistically significant correlations and regression studies. Finally, we found that some of the stylometric cues are very discriminative for the recognition of a user in an identification scenario. Taken together, these facts suggest that some personality traits drive us to chat in a particular fashion, which turns out to be very recognizable.